Recycling Annotated Parallel Corpora for Bilingual Document Composition
Internal identifier: 000318 (Main/Exploration); previous: 000317; next: 000319
Authors: Arantza Casillas [Spain]; Joseba Abaitua [Spain]; Raquel Martinez [Spain]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2000.
Abstract
Parallel corpora enriched with descriptive annotations facilitate multilingual authoring development. Departing from an annotated bitext we show how SGML markup can be recycled to produce complementary language resources. On the one hand, several translation memory databases together with glossaries of proper nouns have been produced. On the other, DTDs for source and target documents have been derived and put into correspondence. This paper discusses how these resources have been automatically generated and applied to an interactive bilingual authoring system. This tool is capable of handling a substantial proportion of text both in the composition and translation of structured documents.
DOI: 10.1007/3-540-39965-8_12
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000099
- to stream Istex, to step Curation: 000099
- to stream Istex, to step Checkpoint: 000266
- to stream Main, to step Merge: 000344
- to stream Main, to step Curation: 000318
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author><name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
</author>
<author><name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
</author>
<author><name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:4F515A6D637D7CE9AC72B228D713AA6632989C07</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1007/3-540-39965-8_12</idno>
<idno type="url">https://api.istex.fr/document/4F515A6D637D7CE9AC72B228D713AA6632989C07/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000099</idno>
<idno type="wicri:Area/Istex/Curation">000099</idno>
<idno type="wicri:Area/Istex/Checkpoint">000266</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Checkpoint">000266</idno>
<idno type="wicri:doubleKey">0302-9743:2000:Casillas A:recycling:annotated:parallel</idno>
<idno type="wicri:Area/Main/Merge">000344</idno>
<idno type="wicri:Area/Main/Curation">000318</idno>
<idno type="wicri:Area/Main/Exploration">000318</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author><name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
<affiliation wicri:level="1"><country xml:lang="fr">Espagne</country>
<wicri:regionArea>Departamento de Automática, Universidad de Alcalá</wicri:regionArea>
<wicri:noRegion>Universidad de Alcalá</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author><name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
<affiliation></affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
<author><name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
<affiliation wicri:level="4"><country xml:lang="fr">Espagne</country>
<wicri:regionArea>Departamento de Sis. Informáticos y Programación, Facultad de Matemáticas, Universidad Complutense de Madrid</wicri:regionArea>
<placeName><settlement type="city">Madrid</settlement>
<region nuts="2" type="region">Communauté de Madrid</region>
</placeName>
<orgName type="university">Université complutense de Madrid</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2000</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">4F515A6D637D7CE9AC72B228D713AA6632989C07</idno>
<idno type="DOI">10.1007/3-540-39965-8_12</idno>
<idno type="ChapterID">12</idno>
<idno type="ChapterID">Chap12</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: Parallel corpora enriched with descriptive annotations facilitate multilingual authoring development. Departing from an annotated bitext we show how SGML markup can be recycled to produce complementary language resources. On the one hand, several translation memory databases together with glossaries of proper nouns have been produced. On the other, DTDs for source and target documents have been derived and put into correspondence. This paper discusses how these resources have been automatically generated and applied to an interactive bilingual authoring system. This tool is capable of handling a substantial proportion of text both in the composition and translation of structured documents.</div>
</front>
</TEI>
<affiliations><list><country><li>Espagne</li>
</country>
<region><li>Communauté de Madrid</li>
</region>
<settlement><li>Madrid</li>
</settlement>
<orgName><li>Université complutense de Madrid</li>
</orgName>
</list>
<tree><country name="Espagne"><noRegion><name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
</noRegion>
<name sortKey="Abaitua, Joseba" sort="Abaitua, Joseba" uniqKey="Abaitua J" first="Joseba" last="Abaitua">Joseba Abaitua</name>
<name sortKey="Casillas, Arantza" sort="Casillas, Arantza" uniqKey="Casillas A" first="Arantza" last="Casillas">Arantza Casillas</name>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
<name sortKey="Martinez, Raquel" sort="Martinez, Raquel" uniqKey="Martinez R" first="Raquel" last="Martinez">Raquel Martinez</name>
</country>
</tree>
</affiliations>
</record>
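The fields of the record above can be read programmatically. The following is a minimal sketch in Python, assuming an abridged copy of the record held in a string; the function name `parse_record` and the namespace URI `urn:example:wicri` are hypothetical and not part of Dilib or Wicri. The `wicri:` prefix is not declared in the record as displayed, so the sketch binds it to a placeholder namespace before parsing; a real export may already carry its own declaration.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the record shown above (only the fields used below).
RECORD = """<record><TEI wicri:istexFullTextTei="biblStruct:series"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Recycling Annotated Parallel Corpora for Bilingual Document Composition</title>
<author><name sortKey="Casillas, Arantza" first="Arantza" last="Casillas">Arantza Casillas</name>
</author>
<author><name sortKey="Abaitua, Joseba" first="Joseba" last="Abaitua">Joseba Abaitua</name>
</author>
<author><name sortKey="Martinez, Raquel" first="Raquel" last="Martinez">Raquel Martinez</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<date when="2000" year="2000">2000</date>
<idno type="doi">10.1007/3-540-39965-8_12</idno>
</publicationStmt>
</fileDesc>
</teiHeader>
</TEI>
</record>"""


def parse_record(text):
    """Return title, authors, DOI and year from a record string."""
    # Assumption: the wicri: prefix is undeclared, so bind it to a
    # placeholder URI; otherwise the parser rejects the document.
    patched = text.replace(
        "<record>", '<record xmlns:wicri="urn:example:wicri">', 1
    )
    root = ET.fromstring(patched)
    title_stmt = root.find(".//titleStmt")
    return {
        "title": title_stmt.findtext("title"),
        "authors": [name.text for name in title_stmt.findall("author/name")],
        "doi": root.findtext(".//publicationStmt/idno[@type='doi']"),
        "year": root.findtext(".//publicationStmt/date"),
    }


if __name__ == "__main__":
    for key, value in parse_record(RECORD).items():
        print(key, value, sep=": ")
```

The lookup is restricted to `titleStmt` so that each author is returned once, even though the full record repeats the names under `analytic` and in the affiliation tree.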
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Ticri/explor/TeiVM2/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000318 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000318 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Ticri |area= TeiVM2 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:4F515A6D637D7CE9AC72B228D713AA6632989C07 |texte= Recycling Annotated Parallel Corpora for Bilingual Document Composition }}
This area was generated with Dilib version V0.6.31.